{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# MNIST Example Neutral Network in TensorFlow\n",
    "\n",
    "To really dive into AI, we need to use one of the many frameworks. TensorFlow is a machine learning library from Google, and is the most well known and widely used to do this kind of work. It has several layers, allowing you to get arbitrarily deep into the weeds when it comes to coding machine learning code. However, for the purposes of this demo, we will stay at a fairly high level, using the packaged \"Keras\" library. Keras lets us look at neural networks in terms of layers of nodes, and is generally easy to work with for new users. It doesn't require a lot of the advanced math that some lower layers might need, instead just requiring a general understanding of when to apply certain techniques\n",
    "\n",
    "In this demo, we will be using TensorFlow's Keras library to work on the MNIST dataset from before. While it is possible for non-AI code to do things like handwritten digit classification, AI is currently the state of the art. It also has the secondary benefit of being significantly easier in some cases to program. Some other approaches involve decision trees or support vector machines (https://en.wikipedia.org/wiki/Support-vector_machine)\n",
    "\n",
    "This notebook will cover some of the data preparation required, training the AI, and evaluating the AI"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To start with, we have our common imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-15T19:10:42.931287Z",
     "start_time": "2021-03-15T19:10:41.256164Z"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2021-12-10 03:21:03.552911: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "import tensorflow as tf\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since this is a new notebook, we will need to load the TensorFlow data again"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-15T19:10:50.012987Z",
     "start_time": "2021-03-15T19:10:46.731625Z"
    }
   },
   "outputs": [],
   "source": [
    "train_df: pd.DataFrame = pd.read_csv('Resources/mnist_train.csv', header=None)\n",
    "test_df: pd.DataFrame = pd.read_csv('Resources/mnist_test.csv', header=None)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will split the dataset in the same way, separating the features from the labels"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-15T19:10:51.940752Z",
     "start_time": "2021-03-15T19:10:51.935375Z"
    }
   },
   "outputs": [],
   "source": [
    "train_features: np.ndarray = train_df.loc[:, 1:].values"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next we need to unpack the features here into a 4d data structure. This is necessary because of the AI we will be using later in the notebook. The AI we use will have a convolutional layer. This layer is explained in more detail below, but it is very good at extracting information from images. However to use it we need to format the data in the way that it expects. The convolutional layer expects each input to be a 3d array, containing a set of pixels arranged by width and height, and then for each element in that matrix to be an array of 1 to 3 elements. In the case of our dataset, it's all greyscale, so we'll have a single element in each array but it could also be a set of three representing an \\[R, G, B\\] pixel value\n",
    "\n",
    "In our case, we simply reshape it into 60,000 28x28x1 arrays"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-15T19:10:55.866585Z",
     "start_time": "2021-03-15T19:10:55.864039Z"
    }
   },
   "outputs": [],
   "source": [
    "train_features = train_features.reshape((train_features.shape[0], 28, 28, 1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next we need to normalize the data. This is a very important step for two reasons: One it helps the model learn faster when the inputs are in the range \\[0, 1\\], and two it helps prevent the vanishing/exploding gradients problem in certain neural networks\n",
    "\n",
    "The exact nature of the vanishing/exploding gradient problem is out of the scope of this demo, but you can find some information on the nature of the problem here: https://medium.com/learn-love-ai/the-curious-case-of-the-vanishing-exploding-gradient-bf58ec6822eb\n",
    "\n",
    "To normalize the data, we simply divide it by the maximum value: 255 in our case, since we know the data is in the range \\[0, 255\\]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-15T19:10:58.776066Z",
     "start_time": "2021-03-15T19:10:58.610898Z"
    }
   },
   "outputs": [],
   "source": [
    "train_features = train_features / 255.0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, we split out the labels"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-15T19:11:01.386654Z",
     "start_time": "2021-03-15T19:11:01.384067Z"
    }
   },
   "outputs": [],
   "source": [
    "train_labels: np.ndarray = train_df[0].values"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We do the same preprocessing for the test dataset\n",
    "\n",
    "It's important to understand why we have a separate dataset for training and testing. Neural networks are designed to \"learn\" associations in data by looking at large sets of data. However, it can also learn some bad habits, for example if all of the digits were written by someone right-handed, it may learn habits associated with right handed digits and perform poorly when it comes to digits written with the left hand. This learning of peculiarities of a given sample of data is called \"overfitting\". It's a problem that arises when a training dataset doesn't fully accurately reflect reality. It is related to, but not the same as having a \"biased\" AI. There are several examples in the news lately of AI having biases for all sorts of reasons. One good way to help ensure good AI performance in terms of overfitting is to ensure that it can perform well on data that it hasn't seen before. To do this, we separate some of the data into a test dataset that is only used to evaluate the performance of the AI, and not to train it. This is only one tool, and bias and overfitting can occur in many ways, but it's always good practice to evaluate the AI on a test dataset to ensure it hasn't overfit to the training dataset.\n",
    "\n",
    "You can read some more about bias in AI with a google search, but a good summary of some of the problems can be read at: https://www.technologyreview.com/s/612876/this-is-how-ai-bias-really-happensand-why-its-so-hard-to-fix/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-15T19:11:04.680977Z",
     "start_time": "2021-03-15T19:11:04.658922Z"
    }
   },
   "outputs": [],
   "source": [
    "test_features: np.ndarray = test_df.loc[:, 1:].values\n",
    "test_features = test_features.reshape((test_features.shape[0], 28, 28, 1))\n",
    "test_features = test_features / 255.0\n",
    "test_labels: np.ndarray = test_df[0].values"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we finally create our model\n",
    "\n",
    "This model is a Sequential model, meaning that each layer sends its outputs to all inputs of the layer that follows it\n",
    "\n",
    "We will add several layers into this model, along with some explanations about why these certain layers are good to use when solving certain problems, however there are many combinations that could possibly work. Building good intuition about when and how to use certain types of AI is going to be important in ensuring that your AIs perform well in the future"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-15T19:11:08.144791Z",
     "start_time": "2021-03-15T19:11:08.115016Z"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2021-12-10 03:21:41.162413: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set\n",
      "2021-12-10 03:21:41.162620: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
      "2021-12-10 03:21:41.162635: W tensorflow/stream_executor/cuda/cuda_driver.cc:326] failed call to cuInit: UNKNOWN ERROR (303)\n",
      "2021-12-10 03:21:41.162666: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (jupyterhub-nb-areznik): /proc/driver/nvidia/version does not exist\n",
      "2021-12-10 03:21:41.162903: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA\n",
      "To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
      "2021-12-10 03:21:41.164445: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set\n"
     ]
    }
   ],
   "source": [
    "model: tf.keras.models.Sequential = tf.keras.models.Sequential()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Conv2D\n",
    "A layer performing convolution over a 2d input\n",
    "\n",
    "Convolution is one of the most important techniques in modern AI, it's built on top of Fourier transformations, and is currently the state of the art when it comes to image analysis. In a basic convolution, one takes a small snapshot of the pixels, and examines how they blend together and applying a filter to strengthen or weaken the effect. This is done over the entire image, which allows things like edges to be strengthened or unimportant parts of the image to be blurred.\n",
    "\n",
    "Deeper understanding of this layer requires quite a bit of math, but an excellent primer can be found at: https://timdettmers.com/2015/03/26/convolution-deep-learning/\n",
    "\n",
    "Note that since this is the first layer, we also can specify input_shape, to help TensorFlow understand the shape of the input"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-15T19:11:11.374584Z",
     "start_time": "2021-03-15T19:11:11.345870Z"
    }
   },
   "outputs": [],
   "source": [
    "model.add(tf.keras.layers.Conv2D(filters=32, kernel_size=(3, 3), input_shape=(28, 28, 1)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### MaxPooling2D\n",
    "A layer pooling the max value in a 2d grid\n",
    "\n",
    "This is a downsampling method, which allows a larger image to be downsampled by the max value in a given grid. In our previous layer, we modified the image to emphasize the important parts of the image (edges, spaces, etc). Now, to make the image easier to process, we take small grids (2x2 in this case), and take the max value from inside that grid and pass it on to the next layer. This helps get rid of less important data from the image, and make processing faster and usually more precise"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-15T19:11:14.311307Z",
     "start_time": "2021-03-15T19:11:14.305730Z"
    }
   },
   "outputs": [],
   "source": [
    "model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2), strides=2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Flatten\n",
    "A layer flattening a multi-dimensional output into a single-dimensional output\n",
    "\n",
    "This layer is pretty simple, flattening our 2d grid into a single array. This is useful to help the next layer process the data more efficiently. The Dense layer that follows works best with 1 or 2 dimensional inputs which keeps the underlying matrix multiplications simple. It's able to understand the same assocations, since the images are all flattened according to the same algorithm"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-15T19:11:16.975843Z",
     "start_time": "2021-03-15T19:11:16.970357Z"
    }
   },
   "outputs": [],
   "source": [
    "model.add(tf.keras.layers.Flatten())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Dense\n",
    "A densely-connected classification layer\n",
    "\n",
    "Dense layers are the basic classification layers, they receive inputs, and determine which parts of the input are important to classify data based on the outputs at the end. It does this by using something called an \"activation function\". The activation function determines from the inputs how large of an output the neuron will have. This simulates the activation of neurons in the brain, with certain levels of signal in a neuron affecting how great of an electrical impluse is sent out to connected neurons. In this case, these neurons take a weighted sum of the inputs, and produce an output. The weights are then updated as part of an optimization function which we'll cover a bit later with techniques like Gradient Descent and Backpropagation\n",
    "\n",
    "To learn more about activation functions (a very key concept) and how they work, have a look at:\n",
    "https://medium.com/the-theory-of-everything/understanding-activation-functions-in-neural-networks-9491262884e0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-15T19:11:19.933355Z",
     "start_time": "2021-03-15T19:11:19.923832Z"
    }
   },
   "outputs": [],
   "source": [
    "model.add(tf.keras.layers.Dense(units=64, activation=tf.nn.relu))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Dropout\n",
    "A layer that randomly removes certain dense nodes from training to prevent overfitting\n",
    "\n",
    "Recall in the previous section where we mentioned overfitting. In order to prevent the AI from learning too many of the exact pecularities of the dataset there are several techniques broadly referred to as \"regularization\". This is one technique that is easy to apply to Keras layers. The dropout layer randomly removes a certain percentage of the previous layers from the network while training, to prevent them from becoming too specialized to the training dataset\n",
    "\n",
    "You can read more about regularization techniques at:\n",
    "https://towardsdatascience.com/regularization-in-machine-learning-76441ddcf99a"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-15T19:11:23.196057Z",
     "start_time": "2021-03-15T19:11:23.174326Z"
    }
   },
   "outputs": [],
   "source": [
    "model.add(tf.keras.layers.Dropout(rate=0.2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Dense (but with softmax activation this time)\n",
    "A classifier layer that outputs a max value\n",
    "\n",
    "In this output layer, we want to classify the work done by all the previous layers. To do so, we take a layer that has nodes representing each class, and take the maximum activation value. That is to say that whichever set of neurons from the previous network provided the greatest confidence in its class becomes the output. In the case of a 0, we would see node 0 having the highest \"activation\" across all of the neurons. In cases where it's close, such as having a .59 and .60 activation we still take the max, knowing that there will likely be some misclassifications in edge cases like that\n",
    "\n",
    "For some fun reading related to the misclassification based on close levels of activation, check out: https://medium.freecodecamp.org/chihuahua-or-muffin-my-search-for-the-best-computer-vision-api-cbda4d6b425d\n",
    "\n",
    "This article covers a set of issues related to misclassifying dogs and bagels (and a google search of this problem can reveal more fun instances of similar issues)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-15T19:11:26.640140Z",
     "start_time": "2021-03-15T19:11:26.631103Z"
    }
   },
   "outputs": [],
   "source": [
    "model.add(tf.keras.layers.Dense(10, activation=tf.nn.softmax))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, we compile our model, using a given optimizer and loss function\n",
    "\n",
    "Our optimizer is the function or set of functions that determine how the model updates its weights as it trains. The loss function is how we calculate how accurate or inaccurate a result from training is. This loss function is minimized by using the optimization function.\n",
    "\n",
    "The techniques used to train the AI are called (broadly) Gradient Descent and Backpropagation. Essentially, the optimizer updates the weights, performs a training iteration, and then updates the weights to be more accurate based on how much they contributed to the correct or incorrect classification during training. The way it calculates the degree that each neuron calculates how much it contributed to the answer is using a technique called \"backpropagation\". The specific functions used can heavily affect how well the AI performs at a given task. In this case, we use the \"adam\" optimizer, and the \"sparse categorical crossentropy\" function to calculate the loss\n",
    "\n",
    "Finally, the metric we want to print out as we go through the training and testing is \"accuracy\". We defined this in the previous notebook as\n",
    "\n",
    "### $ Accuracy = {Correct\\_Predictions \\over Total\\_Predictions} $\n",
    "\n",
    "Explanations of optimization, loss, and gradient descent, tend to be somewhat mathematical explanations and rather than dive in further in this notebook, you can read about exactly how these are calculated here:\n",
    "\n",
    "https://towardsdatascience.com/gradient-descent-in-a-nutshell-eaf8c18212f0\n",
    "\n",
    "https://medium.com/datathings/neural-networks-and-backpropagation-explained-in-a-simple-way-f540a3611f5e\n",
    "\n",
    "https://skymind.ai/wiki/backpropagation\n",
    "\n",
    "The adam optimizer is a variant of Stochastic Gradient Descent, and has some benefits, you can read about them at https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/\n",
    "\n",
    "The loss functions can be read about at: https://blog.algorithmia.com/introduction-to-loss-functions/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-15T19:11:34.035904Z",
     "start_time": "2021-03-15T19:11:34.021333Z"
    }
   },
   "outputs": [],
   "source": [
    "model.compile(optimizer='adam',\n",
    "              loss='sparse_categorical_crossentropy',\n",
    "              metrics=['accuracy'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can see a summary of the layers, and how they are connected with `model.summary()`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-15T19:11:37.220400Z",
     "start_time": "2021-03-15T19:11:37.215245Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Model: \"sequential\"\n",
      "_________________________________________________________________\n",
      "Layer (type)                 Output Shape              Param #   \n",
      "=================================================================\n",
      "conv2d (Conv2D)              (None, 26, 26, 32)        320       \n",
      "_________________________________________________________________\n",
      "max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0         \n",
      "_________________________________________________________________\n",
      "flatten (Flatten)            (None, 5408)              0         \n",
      "_________________________________________________________________\n",
      "dense (Dense)                (None, 64)                346176    \n",
      "_________________________________________________________________\n",
      "dropout (Dropout)            (None, 64)                0         \n",
      "_________________________________________________________________\n",
      "dense_1 (Dense)              (None, 10)                650       \n",
      "=================================================================\n",
      "Total params: 347,146\n",
      "Trainable params: 347,146\n",
      "Non-trainable params: 0\n",
      "_________________________________________________________________\n"
     ]
    }
   ],
   "source": [
    "model.summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Training\n",
    "\n",
    "Finally we train the model with the features and labels we formatted earlier. Depending on the machine, this may happen very quickly or very slowly. Training a model in some more advanced cases could even take days, hence why the advancements in GPU performance have been so crucial in bringing AI into viability for solving many problems that were once thought intractable"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-15T19:12:07.606470Z",
     "start_time": "2021-03-15T19:11:45.483417Z"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2021-12-10 03:22:18.262814: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)\n",
      "2021-12-10 03:22:18.274964: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 2499995000 Hz\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 1/3\n",
      "1875/1875 [==============================] - 22s 12ms/step - loss: 0.4249 - accuracy: 0.8707\n",
      "Epoch 2/3\n",
      "1875/1875 [==============================] - 22s 12ms/step - loss: 0.1137 - accuracy: 0.9651\n",
      "Epoch 3/3\n",
      "1875/1875 [==============================] - 22s 12ms/step - loss: 0.0806 - accuracy: 0.9759\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "<tensorflow.python.keras.callbacks.History at 0x7f8ef4fd4fd0>"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model.fit(train_features, train_labels, epochs=3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Testing\n",
    "\n",
    "Now that we have trained the model, we need to ensure that it will generalize to data that it hasn't seen before. In this case, we want the accuracy and loss to be fairly close to the values we saw at the end of the training. If they're not, that indicates that our model is probably overfitted to the training data to some extent, and won't perform well on data it hasn't seen before.\n",
    "\n",
    "Note that no AI is perfect, and this is a departure from traditional computer science where results tend to be either right or wrong, an AI can make mistakes. This tradeoff is important to understand, and is a reason why AI is not suitable for every problem. However, AI becoming more practical has opened up the ability to solve a vast set of problems that were once considered nearly impossible"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-15T19:12:19.216357Z",
     "start_time": "2021-03-15T19:12:18.587315Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "313/313 [==============================] - 1s 4ms/step - loss: 0.0654 - accuracy: 0.9794\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[0.06543874740600586, 0.9793999791145325]"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model.evaluate(test_features, test_labels)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Saving\n",
    "\n",
    "Finally, we want to save our model out, since we'll be reusing it in a later notebook\n",
    "\n",
    "There are many ways to save the model, but the most common are:\n",
    "* HDF5 format, a which saves the model as a single file with the configuration of the layers and weights included\n",
    "* JSON format, which saves just the configuration of the layers, it requires the weights to be saved separately\n",
    "* SavedModel format, a TensorFlow-specific layout involving a few directories. This format is required by the TensorFlow Serving server which allows you to easily serve the model to other systems"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-15T19:12:33.436780Z",
     "start_time": "2021-03-15T19:12:33.410827Z"
    }
   },
   "outputs": [],
   "source": [
    "# If you want to save the model in its current state in HDF5 format, you would use the following code syntax:\n",
    "\n",
    "model.save('mnist-model/mnist-model.h5', overwrite=True, include_optimizer=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Showing some examples\n",
    "\n",
    "We can also show some examples from the test dataset along with their predictions\n",
    "\n",
    "You can re-run this cell to select a different example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-15T19:12:44.717595Z",
     "start_time": "2021-03-15T19:12:44.565719Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Features:\n",
      "Predicted label: 6\n"
     ]
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAPsAAAD4CAYAAAAq5pAIAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/Z1A+gAAAACXBIWXMAAAsTAAALEwEAmpwYAAANmElEQVR4nO3db6hc9Z3H8c9nY+q/RIybawy3YW+3BuVmcdMyBqFSspYNKpGkT6QBY4RA+sA/LVZY6aIVfSK6bVlwrSSbmLh0DYH6J6DsJoaCFKRkDFlNYtoYuaGJMbnBB71BsJp898E92mu8c+Y6c+ZP7vf9gsvMnO+cOV+GfHJmzm/O+TkiBGD6+5teNwCgOwg7kARhB5Ig7EAShB1I4oJubmzu3LkxNDTUzU0CqYyMjOjUqVOerNZW2G3fLOnfJc2Q9J8R8XjZ84eGhlSv19vZJIAStVqtYa3lj/G2Z0j6D0m3SBqWtMr2cKuvB6Cz2vnOvkTSuxHxXkT8RdJWSSuqaQtA1doJ+6CkP014fLRY9gW219mu266Pjo62sTkA7ej40fiIWB8RtYioDQwMdHpzABpoJ+zHJC2Y8PjrxTIAfaidsO+WtND2N2x/TdIPJG2vpi0AVWt56C0iPrV9j6T/1fjQ26aI2F9ZZwAq1dY4e0S8KunVinoB0EH8XBZIgrADSRB2IAnCDiRB2IEkCDuQRFfPZ0dnHD58uGFt4cKFpes+8cQTpfUHHnigpZ7Qf9izA0kQdiAJwg4kQdiBJAg7kARhB5Jg6G0aeO655xrW7EmvKvy5gwcPVt0O+hR7diAJwg4kQdiBJAg7kARhB5Ig7EAShB1IgnH288CBAwdK64899ljDWrNx9uXLl7fUE84/7NmBJAg7kARhB5Ig7EAShB1IgrADSRB2IAnG2c8Dx44da3ndK6+8srR+2223tfzaOL+0FXbbI5LGJJ2R9GlE1KpoCkD1qtiz/1NEnKrgdQB0EN/ZgSTaDXtI2mH7TdvrJnuC7XW267bro6OjbW4OQKvaDfuNEfFtSbdIutv2d899QkSsj4haRNQGBgba3ByAVrUV9og4VtyelPSipCVVNAWgei2H3faltmd/dl/SMkn7qmoMQLXaORo/T9KLxfnSF0j674j4n0q6whds27at5XVrtfLR0BkzZrT82ji/tBz2iHhP0j9W2AuADmLoDUiCsANJEHYgCcIOJEHYgSQ4xbUPHDp0qLS+cePG0npENKw9+uijLfWE6Yc9O5AEYQeSIOxAEoQdSIKwA0kQdiAJwg4kwTh7H9ixY0dpvdm0y0NDQw1r11xzTSstYRpizw4kQdiBJAg7kARhB5Ig7EAShB1IgrADSTDO3gcGBwfbWn/RokUNa5dccklbr43pgz07kARhB5Ig7EAShB1IgrADSRB2IAnCDiTBOHsfOHz4cK9bQAJN9+y2N9k+aXvfhGVX2N5p+1BxO6ezbQJo11Q+xm+WdPM5yx6UtCsiFkraVTwG0Meahj0iXpf04TmLV0jaUtzfImlltW0BqFqrB+jmRcTx4v4HkuY1eqLtdbbrtuujo6Mtbg5Au9o+Gh/jswo2nFkwItZHRC0iagMDA+1uDkCLWg37CdvzJam4PVldSwA6odWwb5e0pri/RtLL1bQDoFOajrPbfl7SUklzbR+V9DNJj0vaZnutpCOSbu9kkzh/lc09PzIyUrpus/P8h4eHW2kpraZhj4hVDUrfq7gXAB3Ez2WBJAg7kARhB5Ig7EAShB1IglNc+8D4jxA7V++kTZs2ldbXrl3bsNZsKupmtmzZUlpfvXp1W68/3bBnB5Ig7EAShB1IgrADSRB2IAnCDiRB2IEkGGfvA83Gm9utt2PDhg2l9fvvv7+0XtZbsysXjY2NldaffPLJ0vq1117bsHb99deXrjsdsWcHkiDsQBKEHUiCsANJEHYgCcIOJEHYgSQYZ0ephx9+uLT+0UcfldZXrWp0cWLp2WefLV13z549pfWbbrqptH7DDTc0rJ05c6Z03emIPTuQBGEHkiDsQBKEHUiCsANJEHYgCcIOJOFuXnO8VqtFvV7v2vbOF++//35pfcGCBaX1Cy5o/HOJ1157rXTdyy67rLReq9VK62fPni2tHz16tGFt/vz5pes28/TTT5fW77333oa1F154oXTdFStWtNRTr9VqNdXr9UkvItB0z257k+2TtvdNWPaI7WO29xZ/t1bZMIDqTeVj/GZJN0+y/JcRsbj4e7XatgBUrWnYI+J1SR92oRcAHdTOAbp7bL9VfMyf0+hJttfZrtuuj46OtrE5AO1oNey/kvRNSYslHZf080ZPjIj1EVGLiFqzCwwC6JyWwh4RJyLiTESclbRB0pJq2wJQtZbCbnvimMn3Je1r9FwA/aHpOLvt5yUtlTRX0glJPyseL5YUkkYk/TAijjfbGOPsk/vkk09K60uXLi2tv/HGGw1rixYtKl33uuuuK61v3bq1tN5MJ88bb3YM6KqrrmpYGx4eLl139+7dpfWLLrqotN4rZePsTS9eERGTXX1gY9tdAegqfi4LJEHYgSQIO5AEYQeSIOxAElxKug/MnDmztP7UU0+V1pcvX96wtn///tJ1m9WbufPOO9tav5O6efr2+YA9O5AEYQeSIOxAEoQdSIKwA0kQdiAJwg4kwaWkp4GXXnqpYe2uu+4qXXdsbKzaZs5x6NChhrVml5K++OKLS+vtnOLa7LXLLoEtSZdffnlpvVfaupQ0gOmBsANJEHYgCcIOJEHYgSQIO5AEYQeS4Hz2aWDlypUNawcPHixdt9k4/M6dO1vo6K+uvvrqhrVml2NetWqyCxv/1enTp1vqSZJmzZpVWp8xY0bLr92v2LMDSRB2IAnCDiRB2IEkCDuQBGEHkiDsQBKMs09zZed0S9Irr7xSWh8cHCytNzunvMzHH39cWt+8eXPLr93MsmXLSuuzZ8/u2LZ7peme3fYC27+1fcD2fts/KpZfYXun7UPF7ZzOtwugVVP5GP+ppJ9ExLCkGyTdbXtY0oOSdkXEQkm7iscA+lTTsEfE8YjYU9wfk/SOpEFJKyRtKZ62RdLKDvUIoAJf6QCd7SFJ35L0e0nzIuJ4UfpA0rwG66yzXbddb+f7HYD2TDnstmdJ+o2kH0fEnyfWYvyqlZNeuTIi1kdELSJqAwMDbTULoHVTCrvtmRoP+q8j4oVi8Qnb84v6fEknO9MigCo0HXqzbUkbJb0TEb+YUNouaY2kx4vblzvSITqq2amczzzzTGn9vvvuK62fPXu2Ye2OO+4oXffIkSOl9WbDihdeeGHD2kMPPVS67nQ0lXH270haLelt23uLZT/VeMi32V4r6Yik2zvSIYBKNA17RPxO0qQXnZf0vWrbAdAp/FwWSIKwA0kQdiAJwg4kQdiBJDjFFaXKLlM9lTr6B3t2IAnCDiRB2IEkCDuQBGEHkiDsQBKEHUiCsANJEHYgCcIOJEHYgSQIO5AEYQeSIOxAEoQdSIKwA0kQdiAJwg4kQdiBJAg7kARhB5Ig7EAShB1IomnYbS+w/VvbB2zvt/2jYvkjto/Z3lv83dr5dgG0aiqTRHwq6ScRscf2bElv2t5Z1H4ZEf/WufYAVGUq87Mfl3S8uD9m+x1Jg51uDEC1vtJ3dttDkr4l6ffFontsv2V7k+05DdZZZ7tuuz46OtpetwBaNuWw254l6TeSfhwRf5b0K0nflLRY43v+n0+2XkSsj4haRNQGBgba7xhAS6YUdtszNR70X0fEC5IUESci4kxEnJW0QdKSzrUJoF1TORpvSRslvRMRv5iwfP6Ep31f0r7q2wNQlakcjf+OpNWS3ra9t1j2U0mrbC+WFJJGJP2wA/0BqMhUjsb/TpInKb1afTsAOoVf0AFJEHYgCcIOJEHYgSQIO5AEYQeSIOxAEoQdSIKwA0kQdiAJwg4kQdiBJAg7kARhB5JwRHRvY/aopCMTFs2VdKprDXw1/dpbv/Yl0Vurquzt7yJi0uu/dTXsX9q4XY+IWs8aKNGvvfVrXxK9tapbvfExHkiCsANJ9Drs63u8/TL92lu/9iXRW6u60ltPv7MD6J5e79kBdAlhB5LoSdht32z7D7bftf1gL3poxPaI7beLaajrPe5lk+2TtvdNWHaF7Z22DxW3k86x16Pe+mIa75Jpxnv63vV6+vOuf2e3PUPSHyX9s6SjknZLWhURB7raSAO2RyTVIqLnP8Cw/V1JpyU9FxH/UCx7QtKHEfF48R/lnIj4lz7p7RFJp3s9jXcxW9H8idOMS1op6S718L0r6et2deF968WefYmkdyPivYj4i6Stklb0oI++FxGvS/rwnMUrJG0p7m/R+D+WrmvQW1+IiOMRsae4Pybps2nGe/relfTVFb0I+6CkP014fFT9Nd97SNph+03b63rdzCTmRcTx4v4Hkub1splJNJ3Gu5vOmWa8b967VqY/bxcH6L7sxoj4tqRbJN1dfFztSzH+Hayfxk6nNI13t0wyzfjnevnetTr9ebt6EfZjkhZMePz1YllfiIhjxe1JSS+q/6aiPvHZDLrF7cke9/O5fprGe7JpxtUH710vpz/vRdh3S1po+xu2vybpB5K296CPL7F9aXHgRLYvlbRM/TcV9XZJa4r7ayS93MNevqBfpvFuNM24evze9Xz684jo+p+kWzV+RP6wpH/tRQ8N+vp7Sf9X/O3vdW+Sntf4x7pPNH5sY62kv5W0S9IhSa9JuqKPevsvSW9LekvjwZrfo95u1PhH9Lck7S3+bu31e1fSV1feN34uCyTBATogCcIOJEHYgSQIO5AEYQeSIOxAEoQdSOL/AW/oH9qSmTlsAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Select a random example\n",
    "selected_element = np.random.randint(0, 9999)\n",
    "\n",
    "features = test_features[selected_element].reshape((28,28))\n",
    "label = test_labels[selected_element]\n",
    "\n",
    "print(\"Features:\")\n",
    "plt.imshow(features, cmap=\"Greys\")\n",
    "\n",
    "#print(\"Actual label: \" + str(label))\n",
    "\n",
    "print(\"Predicted label: \" + str(model.predict(features.reshape((1,28,28,1)))[0].argmax()))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Congratulations!  You are able to take the image of a digit, and correctly classify it by outputting the correct digit.  This concludes the Tensorflow learning path."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.6"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
